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Abstract 

Of  interest  here  is  the  problem  of  fitting  a curve  or  surface  to  given  data  by  minimizing 
some  norm  of  the  distances  from  the  points  to  the  surface.  These  distances  may  be 
measured  orthogonally  to  the  surface,  giving  orthogonal  distance  regression,  and  for  this 
problem,  the  least  squares  norm  has  attracted  most  attention.  Here  we  will  look  at  two 
other  important  criteria,  the  h norm  and  the  Chebyshev  norm.  The  former  is  of  value 
when  the  data  contain  wild  points,  the  latter  in  the  context  of  accept/reject  criteria. 
There  are  however  circumstances  when  it  is  not  appropriate  to  force  the  distances  to 
be  orthogonal,  and  two  possibilities  of  this  are  also  considered.  The  first  arises  when 
the  distances  are  aligned  with  certain  fixed  directions,  and  the  second  when  angular 
information  is  available  about  the  measured  data  points.  For  the  least  squares  norm,  we 
will  consider  some  algorithmic  developments  for  these  problems. 


1 Introduction 

Of  interest  here  is  the  problem  of  fitting  to  given  data  a curve  or  surface  which  depends  on 
a vector  a G i?"  of  parameters.  The  underlying  approach  is  such  that  (1)  a point  on  the 
surface  is  associated  with  each  data  point,  (2)  the  fit  of  the  surface  is  measured  by  a norm 
of  the  vector  whose  components  are  the  distances  between  each  pair  of  corresponding 
points,  (3)  the  (correct)  Gauss-Newton  steps  in  a are  used  as  a basis  for  minimizing 
this  norm.  The  distances  may  be  orthogonal  to  the  surface,  giving  orthogonal  distance 
regression  (ODR),  or  may  be  forced  to  satisfy  some  other  criterion  which  makes  them 
non-orthogonal  in  general.  We  consider  both  situations. 

For  the  ODR  problem,  most  attention  has  been  given  to  the  least  squares  norm  (eg 
[5],  [8],  [9],  [16],  [17],  [22]).  Here  we  will  look  at  two  other  important  criteria,  the  l\  norm 
and  the  Chebyshev  norm.  The  former  is  of  value  when  the  data  contain  wild  points,  the 
latter  in  the  context  of  accept/reject  criteria.  For  the  non-orthogonal  distance  problem 
we  will  restrict  attention  to  the  least  squares  case. 

In  terms  of  a vector  a € i?"  of  parameters,  the  curve  or  surface  may  be  defined  in 
two  ways,  (a)  parametrically,  when  a point  x on  the  surface  is  given  by 

x = x(a,t). 
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with  t the  parameters  whose  values  define  the  particular  point,  or  (b)  implicitly,  when 
the  surface  is  defined  by  the  set  of  points  x satisfying  the  scalar  equation 


/(a,x)=0. 

It  is  also  assumed  here  that  the  expressions  required  in  these  representations  are  differ- 
entiable functions  of  their  parameters. 

2 h and  loo  ODR 

Consider  first  the  h case.  Then  the  problem  is 

m 

minimize  ||xj  — Zj(a)||, 
i=l 

where  the  points  Zi(a)  are  the  nearest  points  to  Xj  on  the  surface  defined  by  a,  and 
where  we  will  assume  throughout  that  unadorned  norms  are  Euclidean  norms.  Let 

= ||xi  - Zj(a)i|,  i m. 

Then  the  problem  is  effectively  now  defined  in  terms  of  the  vector  a alone.  It  is  easy  to 
to  calculate  the  correct  Gauss-Newton  step  in  a,  which  minimizes 

P + Va(Sd||i 


with  respect  to  d.  Now 

Va<5i  = - 5i  ^ 0, 

so  that  there  are  potential  problems  if  any  6i  — » 0.  Given  the  nature  of  the  h prob- 
lem, we  cannot  exclude  that  possibility.  In  fact  although  6 is  not  a smooth  function, 
because  derivative  discontinuities  only  occur  at  zero  values  it  is  a strong  semi-smooth 
function,  as  defined  in  [12].  Ideas  from  smooth  analysis  and  from  strong  semi-smooth 
analysis  as  developed  in  [11]  can  then  be  combined  to  give  a local  convergence  analysis 
for  the  present  problem.  Fast  local  convergence  for  the  usual  smooth  problem  relies  on 
strong  uniqueness  [4];  for  the  l\  norm,  this  can  be  interpreted  in  terms  of  a requirement 
that  the  sequence  of  solutions  d*'  is  ‘Svell-behaved”  in  a certain  sense  [1].  An  analogous 
requirement  can  be  stated  here. 

Let  the  current  approximation  be  a!^  and  let  denote  the  Jacobian  matrix  VaJ(a*^), 
assuming  this  exists.  Then  the  Gauss-Newton  step  d*'  minimzes 

||J(a'')  + j''d]li. 

It  is  well  known  (see  for  example  [18])  that  if  J'^  has  full  rank  then  there  always  exists 
a solution  d*^  and  an  index  set  containing  n indices  such  that 

Ji(a^)  + efj'=d''  = 0, 

where  e*  is  the  ith  coordinate  vector.  Let  a*  be  a limit  point  of  the  iteration.  Then  for 
a^  close  enough  to  a* , assume  that  J’^  exists  and 
(i)  J(a^)  -|-  J^d^  has  exactly  n zeros,  corresponding  to  an  index  set 
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(ii)  = Z*,  independent  of  k, 

(iii)  the  nxn  matrices  whose  rows  are  ef  J^, i G Z*,  are  bounded  away  from  singularity. 
In  practice  these  conditions  ensure  that  d*  is  unique,  and  there  is  no  redundancy  in  the 
zero  components.  An  analysis  is  given  in  [21]  for  both  parametric  and  implicit  fitting. 
The  main  result  is  the  following. 

Theorem  2.1  [21]  Let  the  Gauss-Newton  method  produce  a sequence  — > a*,  where 

(5(a^)  has  no  zero  components,  and  let  (i)-(iii)  above  hold.  In  the  parametric  case,  assume 
that  for  all  i G Z* , there  exists  a unique  unit  normal  vector  n,;  (up  to  change  of  sign)  at 
the  point  Xj  on  the  surface  defined  by  a*.  Then  the  (undamped)  Gauss-Newton  method 
converges  to  a*  at  a second  order  rate. 

The  significance  of  this  result  is  that,  for  both  parametric  and  implicit  fitting,  any 
Si  tending  to  zero  is  not  by  itself  necessarily  an  obstacle  to  good  performance  of  the 
Gauss-Newton  method  in  the  h case.  What  is  more  significant  is  the  possibility  of  very 
slow  convergence  and  this  has  more  to  do  with  the  number  of  those  zero  components 
of  5 at  a limit  point,  rather  than  just  their  presence.  A fundamental  requirement  for 
the  condition  (ii)  is  that  the  number  of  zero  components  of  <5(a*)  is  n.  Of  course,  this 
condition  is  a rather  special  one,  and  for  many  problems,  will  not  be  satisfied.  There  is 
slow  (possibly  very  slow)  convergence  associated  with  this  case. 

Turning  now  to  the  ioo  problem,  this  can  be  stated 

minimize  max ||xj -z,:(a)||, 

i 

with  Zj(a)  defined  as  before.  Again  Si  — ||xj  — Zi(a)||  is  not  a smooth  function,  but  a 
solution  normally  occurs  in  a region  where  S is  smooth.  Therefore  the  problem  does  not 
differ  significantly  from  the  usual  nonlinear  minimax  problem:  the  main  requirement  for 
fast  local  convergence  is  that  at  a limit  point  the  norm  is  attained  at  n -t  1 indices  [4]. 

Two  simple  examples  in  2 dimensions  are  given  by  way  of  illustration.  A standard 
line  search  is  incorporated  to  force  global  convergence,  although  trust  region  methods 
are  a popular  alternative.  Indeed,  local  convergence  is  the  main  concern  here,  and  we 
have  not  begun  to  address  important  issues  to  do  with  the  development  of  robust  general 
purpose  algorithms. 

Example  2.2  Consider  the  Spath  data  set  [13]  (m  = 7),  and  consider  fitting  an  ellipse 
defined  implicitly,  using  the  loo  and  h norms.  The  solutions  are  illustrated  in  Figure  1, 
where  the  dashed  ellipse  and  dashed  lines  are  the  loo  solution  and  corresponding  ortho- 
gonal directions,  and  the  solid  ellipse  and  solid  lines  are  the  h solution  and  corresponding 
directions.  Both  ellipses  were  obtained  using  the  Gaus.s-Newton  method  starting  from 
the  circle  centre  (5,5),  radius  2,  in  4 and  5 iterations  respectively  for  5 figure  accuracy. 

Example  2.3  Consider  next  the  GGS  data  set  [6],  which  has  m = 8.  Similar  fits  to 
those  for  Example  1 are  shown  in  Figure  2.  Again  the  Gauss-Newton  method  was  used 
starting  from  the  circle  centre  (5,5),  radius  2,  to  give  convergence  in  6 iterations  {loo) 
and  7 iterations  Qi). 

For  both  these  examples  n = 5,  and  favourable  conditions  hold  so  that  there  is 
quadratic  convergence  both  in  the  h and  loo  cases.  Otherwise,  the  key  to  recovering  fast 
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Fig.  1.  lx  and  l^o  fits  to  Spath  data  set. 


local  convergence  in  the  h case  is  to  identify  Z*  and  to  reformulate  the  problem  locally 
as 


minimize  ^ ||xi  - Zi(a)||  subject  to  Xj  - Zj (a)  = 0,  ii  G Z*.  (2.1) 

A similar  remedy  in  the  loc  case  is  as  follows.  For  a limit  point  a*  of  the  iteration,  let 

I*  = {i  : 6i{a*)  = max5j(a*)}. 

i ' 

Then  if  we  can  identify  I*,  a*  solves,  for  any  j £ I*: 

minimize  5j(a)  subject  to  ^i(a)  — Sj{a)  = 0,  i G I*\j- 

Example  2.4  Fitting  an  loo  ODR  line  in  to  100  random  data  points  (equivalent 
to  finding  the  circumscribing  cylinder  of  smallest  radius)  gives  slow  convergence  of  the 
basic  method,  because  |/*|  = 3 and  n = 4.  But  once  we  identify  I*  = {4, 42, 58},  only  5 
iterations  of  the  NAG  Fortran  subroutine  E04UCF  are  required  for  6 figure  accuracy. 

3 Non-orthogonal  I2  distance  regression 

3.1  Using  fixed  directions 

Suppose  that  the  data  come  from  sampling  the  surface  of  a manufactured  part,  using 
a coordinate  measuring  machine  with  a touch  probe.  It  has  been  argued  by  Hulting 
[10]  that  choosing  the  directions  to  be  the  known  probe  directions  (relative  to  a 
fixed  frame  of  reference)  not  only  makes  explicit  use  of  the  measurement  design,  but 
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Fig.  2.  li  and  loo  fits  to  GGS  data  set. 


also  complies  with  traditional  fixed-regressor  assumptions  (enabling  standard  inference 
theory  to  apply). 


Let  Xj,  i = as  usual  be  the  data  points,  and  let  Zj  be  the  corresponding 

points  on  the  surface  reached  by  travelling  along  the  lines  from  x,-  in  the  direction  Vj. 
Then  we  require  to  minimize  ||(5||  where 

Si  = ||x,-  -z,:(a)||,  i = 


with  Zj(a)  defined  by 


Zj(a)  - Xf  = SiVi,  i = 


where  Vj  satisfying  vfvj  = 1 is  given  for  each  i.  In  case  of  ambiguity,  the  smallest  value 
of  Si  is  chosen.  The  basic  idea  in  efficient  algorithmic  development  is  again  to  treat  the 
problem  as  one  in  a alone,  which  can  be  solved  as  before  by  the  Gauss-Newton  method 
(or  variants).  Let  a be  given.  Then  for  each  point  Xi,  the  point  where  the  line  through 
Xj  in  the  direction  Vj  first  cuts  the  surface  can  be  obtained  (this  calculation  replaces  the 
“footpoint  problem”  of  calculating  z,(a)  as  the  point  on  the  surface  in  the  orthogonal 
distance  problem),  giving  Si  as  a function  of  a.  Methods  based  on  Gauss-Newton  steps 
are  developed  for  the  parametric  case  in  [19],  [20],  and  for  the  implicit  case  in  [7]. 

By  way  of  illustration,  the  2 data  sets  previously  considered  in  Examples  1 and  2 
are  used  to  fit  ellipses  defined  implicitly  with  a particular  choice  of  directions  Vj.  The 
initial  (circles)  and  final  ellipses  (together  with  the  data  points  and  the  directions  v,;) 
are  shown  in  Figures  3 and  4.  The  calculations  needed  respectively  19  and  17  iterations, 
reflecting  the  fact  that,  unlike  the  li  and  loo  cases,  the  convergence  rate  is  linear. 
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Fig.  3.  I2  fit  to  Spath  data  set:  fixed  v^. 


3.2  Using  angular  information 

Berman  and  Griffiths  [2,  3]  consider  fitting  a circle  when  angular  differences  between 
successively  measured  data  points  are  known,  with  applications  in  physics  and  archae- 
ology. This  fitting  problem  has  been  extended  to  the  case  of  ellipses  and  ellipsoids  by 
Spath  in  [14,  15]  and  it  is  this  kind  of  problem  which  is  of  interest  here.  The  methods 
of  [14]  and  [15]  are  based  on  the  alternating  algorithm,  and  while  this  can  be  perhaps 
surprisingly  effective  (particularly  with  a reparameterization  of  the  problem),  we  con- 
sider here  a correct  separated  Gauss-Newton  method  similar  to  that  used  before.  In 
addition  to  (usually)  better  local  convergence  properties,  standard  step-length  control 
can  be  incorporated. 
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To  illustrate,  consider  fitting  an  ellipse  in  general  position.  It  is  convenient  to  do  this 
by  allowing  the  data  to  rotate,  and  fitting  to  those  an  ellipse  in  normal  position,  aligned 
with  the  axes.  Let  {x,y)  denote  the  components  of  x.  Then  we  work  with  the  data 

Xi{(j))  = Xi  cos  4)  + yi  sin  (f),  yi{(j))  = -Xi  sin  4>  + Vi  cos  (f), 

for  i = 1, . . . ,m,  where  0 is  an  unknown  parameter.  Therefore  we  require  to  minimize, 
with  respect  to  the  6 parameters  a,  b,p,  q,  a,  (f>,  the  function 

m 

^{(xi{4i)-a-pcos{a  + ti))^ + {yi{(t))-b-qsin{a  + ti))^}, 

i=l 

where  the  numbers  U are  given.  Because  {a  + U+i)  - (a  + U)  = tj+i  - t,,  for  each  i,  we 
can  interpret  this  as  saying  say  that  the  angular  differences  are  known,  with  a degree  of 
freedom  given  by  the  parameter  a.  Note  that  at  a solution  to  this  problem,  the  directions 
between  pairs  of  points  {xi{(j>),yi{4>))  and  the  corresponding  points  on  the  ellipse  will 
not  generally  be  orthogonal  to  the  ellipse. 

Differentiating  the  above  expression  with  respect  to  a,p,  b,  q gives 

Ai\l]=c,  (3.1) 

where 

4 = [ ^ EHlCOs(«  + ^i)  1 p = [ 1. 

^ [ EIliCOs(a  + ii)  Efci  + <<)  J ’ ^ [ Eilia:i(0)cos(o  + fj)  J ’ 

A2  \ =C2,  (3.2) 

where 

A ^ EfeiSin(a  + t<)  1 _ E*™il/tW 

^ [ Efeisin(a  + fi)  Eilisin^(tt  + ^i)  J ’ ^ [ Efei  sin(a  + 1^)  . ‘ 

Then  (3.1)  and  (3.2)  give  (a,b,p,q)  as  functions  of  a and  <f>,  provided  that  Ai  and  A2 
are  nonsingular:  this  will  be  assumed.  For  given  a and  (j>,  we  can  therefore  define  the 
function  to  be  minimized  as 

F(a,0)  = ||<5(q:,(/»)||, 

where  . 

<5i  = llwill,  i = l,...,m,  (3.3) 

with 

Wi  = {xi{(f))-a-pcos{a  + ti),  yi{4))-b-qsin{a  + ti))'^, 

and  with  a,b,p,q  defined  by  (3.1)  and  (3.2).  Then  we  can  apply  the  Gauss-Newton 
method  to  the  minimization  of  F{a,  <j>).  The  basic  step  d = (<5q,  is  given  by  finding 

min  ||(5  -I-  Jd||,  (3.4) 
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where  J S hag  row  given  by 


ef J = Va,0(Si(a,^),  i 


where 


M = Vo 


0 


It  is  easy  to  compute  M from  (3.1)  and  (3.2)  which  can  be  interpreted  as  identities  in 
a and  (f>.  The  details  are  omitted,  but  all  the  linear  systems  use  just  the  matrices  Ai 
and  A2,  and  apart  from  the  solution  of  (3.4)  (a  least  squares  problem  in  two  variables), 
there  remains  only  evaluation  of  expressions. 

Example  3.1  Consider  Example  1 from  [14],  which  has  m = 11.  Starting  from  a = 
0,  ^ = 0,  15  iterations  are  required  to  satisfy  the  stopping  criterion  ||d||oo  < 0.001. 
The  resulting  value  of  ||<5|p  is  7.7211,  with  a = 2.1253,  b = -0.1700,  p = 4.1281,  q = 
3.0931,  a = 13.2348°,  </)  = 34.7309°. 

Example  3.2  Next  consider  Example  2 from  [14],  which  has  m = 8.  Again  starting 
from  a = 0,  0 = 0,  9 iterations  are  required  to  satisfy  the  stopping  criterion  ||d||oo  < 

0.001.  The  resulting  value  of  ||<5||^  is  4.4946,  with  a = 4.3608,  b = 1.9537,  p = 5.3717,  q = 
3.3704,  a = -0.6215°,  0 = 26.3889°. 


4 Conclusions 

We  have  examined  some  aspects  of  fitting  curves  and  surfaces  to  given  data.  The  un- 
derlying criterion  involves  associating  with  each  data  point  a point  on  the  surface  and 
minimizing  some  norm  of  the  vector  whose  components  are  the  distances  between  pairs 
of  points.  The  distances  can  be  orthogonal  to  the  surface,  or  fixed  in  some  other  way.  But 
the  problems  have  in  common  that  methods  based  on  separated  Gauss-Newton  steps  can 
readily  be  developed. 
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